Paleopolyploidy refers to ancient genome duplications which occurred at least several million years ago (mya). The genome doubling event could either be an autopolyploidy or an allopolyploidy. Due to functional redundancy, genes are rapidly silenced and/or lost from the duplicated genomes. Most paleopolyploids, through evolutionary time, have lost their polyploid status through a process called diploidization, and are currently referred to as "diploids" (e.g. baker's yeast, Arabidopsis, and perhaps humans).
Paleopolyploidy is extensively studied in plant lineages. It has been found that almost all flowering plants have undergone at least one round of genome duplication at some point during their evolutionary history. An ancient genome duplication is also found in the early ancestor of vertebrates (a group that includes the human lineage), and another near the origin of the bony fishes. Evidence suggests that baker's yeast (Saccharomyces cerevisiae), which has a compact genome, experienced polyploidization during its evolutionary history.
Ancient genome duplications are widespread throughout eukaryotic lineages, particularly in plants. Studies suggest that the common ancestor of Poaceae, the grass family which includes important crop species such as maize, rice, wheat, and sugar cane, had a genome duplication 50–70 mya[1]. Further independent whole genome duplications have occurred in the lineages leading to maize, sugar cane and wheat, but not rice.
The core eudicots (rosids and asterids) also share a common whole genome duplication, and many species within these groups have experienced additional whole genome duplications. For example, the model plant Arabidopsis thaliana, the first plant to have its entire genome sequenced, experienced at least two additional rounds of whole genome duplication since the duplication shared by the core eudicots[2]. The most recent event took place before the divergence of the Arabidopsis and Brassica lineages, 25–40 mya.
Compared with plants, paleopolyploidy is much rarer in the animal kingdom. It has been identified mainly in amphibians and bony fishes. Although some studies have suggested that one or more genome duplications are shared by all vertebrates (including humans), the evidence is weaker than in the other cases because the duplications, if they exist, happened so long ago, and the question is still under debate. The idea that vertebrates share a common whole genome duplication is known as the 2R Hypothesis. Many researchers are interested in why animal lineages, particularly mammals, have had far fewer whole genome duplications than plant lineages.
A well-supported paleopolyploidy has been found in baker's yeast (Saccharomyces cerevisiae), despite its small, compact genome (~13 Mbp); the duplication occurred after the divergence from K. waltii.[3] Through genome streamlining, yeast has lost 90% of the duplicated genome over evolutionary time and is now recognized as a diploid organism.
Duplicated genes can be identified through sequence homology at the DNA or protein level. Paleopolyploidy can be identified as a massive burst of gene duplication at a single point in time using a molecular clock. To distinguish a whole-genome duplication from a collection of single-gene duplication events (which are common in the genome), the following rules are often applied:
In theory, all gene pairs created by a single paleopolyploidy event (homeologs) should have the same "age"; that is, the sequence divergence between the two copies should be equal across pairs. The synonymous substitution rate, Ks, is often used as a molecular clock to determine the time of gene duplication. Thus, paleopolyploidy is identified as a "peak" on the duplicate number vs. Ks graph (shown on the right).
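The peak-finding idea can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names (`ks_histogram`, `find_peaks`, `ks_to_age`), the bin width, and above all the synthetic pair counts, which are invented and do not come from any real genome. Real analyses typically estimate Ks with dedicated software and fit statistical models to the distribution; this sketch only bins Ks values and reports local maxima, plus the standard clock conversion T = Ks / (2 × rate), where the rate must be supplied by the user.

```python
def ks_histogram(ks_values, bin_width=0.1, max_ks=1.2):
    """Count duplicate pairs per Ks bin; values outside [0, max_ks) are ignored."""
    n_bins = int(round(max_ks / bin_width))
    counts = [0] * n_bins
    for ks in ks_values:
        if 0 <= ks < max_ks:
            counts[min(int(ks / bin_width), n_bins - 1)] += 1
    return counts


def find_peaks(counts):
    """Return indices of interior bins strictly higher than both neighbours."""
    return [i for i in range(1, len(counts) - 1)
            if counts[i] > counts[i - 1] and counts[i] > counts[i + 1]]


def ks_to_age(ks, rate_per_site_per_year):
    """Molecular-clock estimate: both copies accumulate changes, so T = Ks / (2 * rate)."""
    return ks / (2.0 * rate_per_site_per_year)


# Synthetic (bin centre, pair count) data, invented for illustration only.
# The decaying left side mimics ongoing single-gene duplications; the bump
# mimics a burst of same-age duplicates such as a whole-genome duplication.
SYNTHETIC_BINS = [
    (0.05, 40), (0.15, 30), (0.25, 22), (0.35, 16), (0.45, 12), (0.55, 14),
    (0.65, 25), (0.75, 38), (0.85, 27), (0.95, 13), (1.05, 7), (1.15, 4),
]
ks_values = [centre for centre, n in SYNTHETIC_BINS for _ in range(n)]

counts = ks_histogram(ks_values)
peak_bins = find_peaks(counts)
# Convert bin indices back to bin-centre Ks values.
peak_ks = [round((i + 0.5) * 0.1, 2) for i in peak_bins]

print("pairs per bin:", counts)
print("peak Ks value(s):", peak_ks)
```

In this synthetic data the only interior local maximum is the bin centred on Ks = 0.75. Converting such a peak to an age with `ks_to_age` requires a lineage-specific substitution rate, which is itself an estimate, so the resulting date should be treated as approximate.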
Duplication events that occurred a long time ago in the history of various evolutionary lineages can be difficult to detect because of subsequent diploidization (in which a polyploid comes to behave cytogenetically as a diploid over time), as mutations and gene translocations gradually make one copy of each chromosome unlike its counterpart. This usually results in low confidence when identifying a very ancient paleopolyploidy.
Paleopolyploidization events lead to massive cellular changes, including doubling of the genetic material, changes in gene expression and increased cell size. Gene loss during diploidization is not completely random, but heavily selected: genes from large gene families are duplicated, whereas individual genes are not. Overall, paleopolyploidy can have both short-term and long-term evolutionary effects on an organism's fitness in the natural environment.
The hypothesis of human paleopolyploidy originated as early as the 1970s, proposed by the biologist Susumu Ohno. He reasoned that the vertebrate genome could not have achieved its complexity without large-scale whole-genome duplications. The "two rounds of genome duplication" hypothesis (2R hypothesis) followed and gained popularity, especially among developmental biologists.
However, the 2R hypothesis has been questioned by many researchers. According to the hypothesis, the human genome should have a 4:1 gene ratio compared with invertebrate genomes. This is not supported by findings from the 48 vertebrate genome projects available in mid-2011. For example, the human genome consists of ~21,000 protein-coding genes according to June 2011 counts at the UCSC and Ensembl genome analysis centers, while an average invertebrate genome contains about 15,000 genes. Further, the recent completion of the amphioxus genome sequence does not support any such whole genome duplication with large-scale retention, as predicted by the hypothesis.[4] Additional arguments against 2R were based on the lack of the (AB)(CD) tree topology among four members of a gene family in vertebrates. However, if the two genome duplications occurred close together, we would not expect to find this topology.[5]
These recent findings have largely supported the 2R hypothesis.
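The gene-count argument discussed above can be made concrete with a little arithmetic. The sketch below is deliberately naive and rests on stated assumptions: it takes the ~15,000-gene invertebrate figure as the pre-duplication baseline, assumes each round exactly doubles the gene count, and ignores gene gain by other mechanisms. The function name `naive_2r_expectation` is hypothetical.

```python
def naive_2r_expectation(baseline_genes, observed_genes, rounds=2):
    """Compare an observed gene count with the count expected if every
    duplicate from `rounds` whole-genome duplications had been retained."""
    expected_full_retention = baseline_genes * 2 ** rounds
    observed_ratio = observed_genes / baseline_genes
    # Fraction of the extra copies created by duplication that would still
    # need to be present to explain the observed count.
    extra_created = expected_full_retention - baseline_genes
    extra_observed = observed_genes - baseline_genes
    retained_fraction = extra_observed / extra_created
    return expected_full_retention, observed_ratio, retained_fraction


# Figures quoted in the text: ~15,000 genes (average invertebrate) and
# ~21,000 protein-coding genes (human).
expected, ratio, retained = naive_2r_expectation(15_000, 21_000)
print(f"expected with full retention: {expected}")
print(f"observed ratio: {ratio:.2f}:1")
print(f"implied retention of extra copies: {retained:.1%}")
```

Under these assumptions, full retention would predict 60,000 genes, whereas the observed ratio is 1.4:1. A ratio far below 4:1 is therefore compatible with two rounds of duplication only if most duplicates were subsequently lost, which is what diploidization would predict.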